Application Aware for Byzantine Fault Tolerance
Abstract
Driven by the need for higher reliability in many distributed systems, various replication-based fault tolerance technologies have been widely studied. A prominent one is Byzantine fault tolerance (BFT). BFT can help achieve high availability and trustworthiness by ensuring replica consistency despite hardware failures and malicious faults on a small portion of the replicas. However, most state-of-the-art BFT algorithms are designed for generic stateful applications that require the total ordering of all incoming requests and the sequential execution of such requests. In this dissertation research, we recognize that a straightforward application of existing BFT algorithms is often inappropriate for many practical systems: (1) not all incoming requests must be executed sequentially according to some total order, and doing so would incur unnecessary (and often prohibitively high) runtime overhead; and (2) a sequential execution of all incoming requests might violate the application semantics and might result in deadlocks for some applications. In the past four and a half years of my dissertation research, I have focused on designing lightweight BFT solutions for a number of Web services applications (including a shopping cart application, an event stream processing application, Web service business activities (WS-BA), and Web service atomic transactions (WS-AT)) by exploiting application semantics. The main research challenge is to identify how to minimize the use of Byzantine agreement steps and to enable concurrent execution of requests that are commutable or unrelated. We have shown that the runtime overhead can be significantly reduced by adopting our lightweight solutions. One limitation of our solutions is that they require intimate knowledge of the application design and implementation, so designing such BFT solutions for complex applications may be expensive and error-prone.
Recognizing this limitation, we investigated the use of Conflict-free Replicated Data Types (CRDTs) to construct highly concurrent Byzantine fault tolerance systems. This approach does not require exploiting as much application semantics, since all operations are commutative.
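To make the commutativity argument concrete, here is a minimal sketch of one well-known CRDT, a grow-only counter (G-Counter). It is not taken from the dissertation: the class name `GCounter`, the helper `merge_all`, and the replica identifiers are hypothetical, and the sketch covers only the order-independence of merges, not any Byzantine agreement or fault detection.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica (illustrative only)."""
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id, amount=1):
        # Each replica only ever increases its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def merge(self, other):
        # Element-wise maximum: commutative, associative, and idempotent.
        merged = dict(self.counts)
        for replica_id, n in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), n)
        return GCounter(merged)

    def value(self):
        return sum(self.counts.values())


def merge_all(states):
    """Fold a sequence of replica states into one, in the order given."""
    result = GCounter()
    for state in states:
        result = result.merge(state)
    return result


if __name__ == "__main__":
    # Three hypothetical replicas apply updates independently.
    a, b, c = GCounter(), GCounter(), GCounter()
    a.increment("r1", 2)
    b.increment("r2", 3)
    c.increment("r3", 1)

    # Every delivery order of the three states yields the same total.
    totals = {merge_all(order).value() for order in permutations([a, b, c])}
    print(totals)
```

Because the merged state is the same under every delivery order, replicas holding such a data type do not need to agree on a total order of updates before applying them, which is the property the paragraph above relies on.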
Related resources
Design and implementation of a Byzantine fault tolerance framework for non-deterministic applications
State-machine-based replication is an effective way to increase the availability and dependability of mission-critical applications. However, all practical applications contain some degree of non-determinism. Consequently, ensuring strong replica consistency in the presence of application non-determinism has been one of the biggest challenges in building dependable distributed systems. In this ...
Hosting Byzantine Fault Tolerant Services on a Chord Ring
In this paper we demonstrate how stateful Byzantine Fault Tolerant services may be hosted on a Chord ring. The strategy presented is fourfold: firstly a replication scheme that dissociates the maintenance of replicated service state from ring recovery is developed. Secondly, clients of the ring based services are made replication aware. Thirdly, a consensus protocol is introduced that supports ...
Towards a Fault-aware Computing Environment
In this paper, we propose and present the design and initial development of the Fault awareness Enabled Computing Environment (FENCE) system for high end computing. FENCE is a comprehensive fault management system in the sense that it consists of both post and runtime analysis, integrates both proactive and reactive mechanisms, and combines both application level and system level fault manageme...
On the Gaussian error function
Many steganographers would agree that, had it not been for the Ethernet, the exploration of model checking might never have occurred. After years of essential research into web browsers, we verify the improvement of redundancy, which embodies the confusing principles of cryptoanalysis. RIBAND, our new application for the deployment of Byzantine fault tolerance, is the solution to all of these o...
On Byzantine Containment Properties of the min + 1 Protocol
Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed system to recover from any transient fault that arbitrarily corrupts the contents of all memories in the system. Byzantine tolerance is an attractive feature of distributed systems that permits to cope with arbitrary malicious behaviors. We consider the well known problem of constructing a breadth-first...